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We apply random matrix theory to derive spectral density of large sample covariance matrices gen- 
r-j- erated by multivariate VMA(q), VAR(q) and VARMA(gi, 52) processes. In particular, we consider 

a limit where the number of random variables N and the number of consecutive time measurements 
T are large but the ratio N/T is fixed. In this regime the underlying random matrices are asymp- 
totically equivalent to Free Random Variables (FRV). We apply the FRV calculus to calculate the 
eigenvalue density of the sample covariance for several VARMA-type processes. We explicitly solve 
the VARMA(1, 1) case and demonstrate a perfect agreement between the analytical result and the 
spectra obtained by Monte Carlo simulations. The proposed method is purely algebraic and can be 
r*j ■ easily generalized to q± > 1 and 92 > 1- 
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C\j ■ I. INTRODUCTION 

o 
o 

Vector auto-regressive (VAR) models play an important role in contemporary macro-economics, being an example 
of an approach called the "dynamic stochastic general equilibrium" (DSGE), which is superseding traditional large- 
scale macro-econometric forecasting methodologies [l|. The motivation behind them is based on the assertion that 
/\i ' more recent values of a variable are more likely to contain useful information about its future movements than the 
older ones. On the other hand, a standard tool in multivariate time series analysis is vector moving average (VMA) 
models, which is really a linear regression of the present value of the time series w.r.t. the past values of a white 
noise. A broader class of stochastic processes used in macro-economy comprises both these kinds together in the 
form of vector auto-regressive moving average (VARMA) models. These methodologies can capture certain spatial 
and temporal structures of multidimensional variables which are often neglected in practice; including them not only 
results in more accurate estimation, but also leads to models which are more interprctable. They are widely used by 
academia and central banks (cf. the European Central Bank's Smets-Wouters model for the euro zone Q), as they 
constitute quite a simple version of the DSGE equations. 

VARMA models are constructed from a number of univariate ARMA (Box-Jenkins; see for example Q) processes, 
typically coupled with each other. In this paper, we investigate only a significantly simplified circumstance when there 
is no coupling between the many ARMA components. One may argue that this is too far fetched and will be of no use 
in describing an economic reality. However, one may also treat it as a "zeroth-order hypothesis," analogously to the 
idea of [U, HI m finance, namely that the case with no cross-covariances is considered theoretically, and subsequently 
compared to some real-world data modeled by a VARMA process; any discrepancy between the two will reflect 
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nontrivial cross-covariances present in the system, thus permitting their investigation. This latter route is taken in 
this communication. 

A challenging and yet increasingly important problem is the estimation of large covariance matrices generated by 
these stationary VARMA(gi, 52) processes, since high dimensionality of the data as compared to the sample size is 
quite common in many statistical problems (the "dimensionality curse"). Therefore, an appropriate "noise cleaning" 
procedure has to be implemented, and random matrix theory (RMT) provides a natural and efficient outfit for 
doing that. In particular, the mean spectral densities (a.k.a. "limiting spectral distributions," LSD) of the Pearson 
estimators of the cross-covariances for the VMA(l) and VAR(l) models, in the relevant high-dimensionality sector 
and under the full decoupling, have been derived in Q by applying the framework proposed by Q. 

In this paper, we suggest that such calculations can be considerably simplified by resorting to a mathematical 
conce pt o f the free random variables (FRV) calculus 0, Q, succinctly introduced in sec. |TTJ Our general FRV for- 
mula [lCf allows not only to rediscover, which much less strain, the two fourth-order polynomial equations obtained 
in Q in the VMA(l) and VAR(l) cases, but also to derive a sixth-order equation (|45p which produces the mean 
spectral density for a more involved VARMA(1,1) model. The results are verified by numerical simulations, which 
show a perfect agreement. This is all done in sec. IIIII 



II. DOUBLY CORRELATED WISHART ENSEMBLES AND FREE RANDOM VARIABLES 



A. Doubly Correlated Wishart Ensembles 



1. Correlated Gaussian Random Variables 



VARMA(gi, ^2) stochastic processes, as we will see below, fall within quite a general set-up encountered in many 
areas of science where a probabilistic nature of multiple degrees of freedom evolving in time is relevant, for example, 
multivariate time series analysis in finance, applied macro-econometrics and engineering. To describe this framework, 
consider a situation of N time-dependent random variables which are measured at T consecutive time moments 
(separated by some time interval St); let Yi a be the value od the i-th (i = 1, . . . , N) random number at the a-th time 
moment (a = I, . . . ,T); together, they make up a rectangular N x T matrix Y. In what usually would be the first 
approximation, each Yi a is supposed to be drawn from a Gaussian probability distribution. We will also assume that 
they have mean values zero, (Yi a ) = 0. These degrees of freedom may in principle display mutual correlations. A set of 
correlated zero-mean Gaussian numbers is fully characterized by the two-point covariance function, Ci a .jb = (YiaYjb) 
if the underlying stochastic process generating these numbers is stationary. Linear stochastic processes, including 
VARMA(<7i, (72), belong to this category. We will restrict our attention to an even narrower class where the cross- 
correlations between different variables and the auto-correlations between different time moments are factorized, 
i.e., 

(Y ia Y jb ) = C^Aat. (1) 

In this setting, the inter-variable covariances do not change in time (and are described by an N x TV cross-covariance 
matrix C), and also the temporal covariances are identical for all the numbers (and are included in a T x T auto- 
covariance matrix A; both these matrices are symmetric and positive-definite). The Gaussian probability measure 
with this structure of covariances is known from textbooks, 

1 / 1 N T \ 

P C . G .(Y)DY = —-exp -2 E E Y ™ l C ~% Y ^ [A^L DY = 
c °- \ ij=l 0,6=1 / 



c.G. 



■ exp ( -^TrYTC^YA- 1 ) DY, (2) 



where the normalization constant A4.G. = (27r) iVT / 2 (DctC) T / 2 (DetA) JV / 2 , and the integration measure 

DY = IliLl lla=l dY ™i While the letterS " C - G -" St£Lnd for 

"correlated Gaussian." 
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Now, a standard way to approach correlated Gaussian random numbers is to recall that they can always be 
decomposed as linear combinations of uncorrelated Gaussian degrees of freedom; indeed, this is achieved through the 
transformation 

Y = VCYVA, which yields P G .(Y)DY = cxp ^-^TrY 1 ^ DY, (3) 

where the square roots of the covariance matrices, necessary to facilitate the transition, exist due to the positive- 
definiteness of C and A; the new normalization reads A/g. = (2t:) nt ^ 2 . 



2. Estimating Equal-Time Cross- C'ovariances 



An essential problem in multivariate analysis is to determine (estimate) the covariance matrices C and A from 
given A~ time series of length T of the realizations of our random variables Yi a . For simplicity, we do not distinguish in 
notation between random numbers, i.e., the population, and their realizations in actual experiments, i.e., the sample. 
Since the realized cross-covariance between degrees i and j at the same time a is Yi a Yj a , the simplest method to 
estimate the today's cross-covariance Cy is to compute the time average, 

1 T 1 1 ~ ~ 

Ci 3 - f E% i-e, c = -YY T = -\/CYAY T Vc. (4) 

a=l 

This is usually named the "Pearson estimator", up to the prefactor which depending on the context is 1/(T — 1) or 
l/T. Other estimators might be introduced, such as between distinct degrees of freedom at separate time moments 
( "time delayed estimators" ) , or with certain decreasing weights given to older measurements to reflect their growing 
obsolescence ("weighted estimators"), but we will not investigate them in this article. Furthermore, in the last equality 
in (HI, we cast c through the uncorrelated Gaussian numbers contained in Y, the price to pay for this being that 
the covariance matrices now enter into the expression for c, making it more complicated; this will be the form used 
hereafter. The random matrix c is called a "doubly correlated Wishart ensemble" [Til ] . 

Let us also mention that the auto-covariance matrix A can be estimated through a = (l/iV)Y T Y. However, it is 
verified that this object carries identical information to the one contained in c (it is "dual" to c), and therefore may 
safely be discarded. Indeed, these two estimators have same non-zero eigenvalues (modulo an overall rescaling by r), 
and the larger one has \T — N\ additional zero modes. 

Any historical estimator is inevitably marred by the measurement noise; it will reflect the true covariances only 
to a certain degree, with a superimposed broadening due to the finiteness of the time series. More precisely, there 
are N(N + l)/2 independent elements in C, to be estimated from NT measured quantities Y, hence the estimation 
accuracy will depend on the "rectangularity ratio," 



N 



(5) 



the closer r to zero, the more truthful the estimate. This is a cornerstone of classical multivariate analysis. Unfortu- 
nately, a practical situation will typically feature a large number of variables sampled over a comparably big number 
of time snapshots, so that we may approximately talk about the "thermodynamical limit," 

A" — > oo, T — > oo, such that r = fixed. (6) 

On the other hand, it is exactly this limit in which the FRV calculus (see the subsection below for its brief elucidation) 
can be applied; hence, the challenge of de-noising is somewhat counterbalanced by the computationally powerful FRV 
techniques. 
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B. A Short Introduction to the Free Random Variables Calculus: The Multiplication Algorithm 



1. The M -Transform and the Spectral Density 



Any study of a (real symmetric K x K) random matrix H will most surely include a fundamental question about 
the average values of its (real) eigenvalues Ai, . . . , Xk- They are concisely encoded in the "mean spectral density," 

^h(A) = ^ £ {5 (A - A,-)) = 1 (Tr (Al* - H)) . (7) 

i=l 

Here the expectation map (. . .) is understood to be taken w.r.t. the probability measure F(H)DH of the random 
matrix. We will always have this distribution rotationally (i.e., H — > O t HO, with O orthogonal) invariant, and 
hence the full information about H resides in its eigenvalues, distributed on average according to 

On the practical side, it is more convenient to work with either of the two equivalent objects, 

Gh{z) = 4 /tt— — -\ , or M H (z) = zGn{z) - 1, (8) 
A \ zl K - H/ 

referred to as the "Green's function" (or the "resolvent") and the "M-transform" of H. The latter is also called the 
"moments' generating function," since if the "moments" Mn.n = (l/A)(TrH™) of H exist, it can be expanded into a 
power series around z — > oo as Mn(z) = X)«>i Mn,n/z n . It should however be underlined that even for probability 
measures disallowing such an expansion (heavy-tailed distributions, preeminent in finance, being an example), the 
quantities (HJ still manage to entirely capture the spectral properties of H; hence the name "M-transform" more 
appropriate, in addition to being more compact. 

We will show that for our purposes (multiplication of random matrices; see par. Ill B 2[) the M-transform serves 
better than the Green's function. However, it is customary to write the relationship between ([7]) and © in terms of 
this latter, 

Ph(A) = -- lim ImGWA + ie) = --^r lim (G H (A + ie) - G H (A - ie)) . (9) 

7T e->0+ Z7T1 e-i-0+ 

resulting from a well-known formula for generalized functions, lim e _> + l/(x ± ie) = pv(l/a;) =F iirS(x). 



2. The N -Transform and Free Random Variables 



The doubly correlated Wishart ensemble c Q may be viewed as a product of several random and non-random 
matrices. The general problem of multiplying random matrices seems formidable. In classical probability theory, it 
can be effectively handled in the special situation when the random terms are independent: then, the exponential 
map reduces it to the addition problem of independent random numbers, solved by considering the logarithm of 
the characteristic functions of the respective PDFs, which proves to be additive. In matrix probability theory, a 
crucial insight came from D. Voiculescu and coworkers and R. Spcicher 0, Q, who showed how to parallel the 
commutative construction in the noncommutativc world. It starts with the notion of "freeness," which basically 
comprises probabilistic independence together with a lack of any directional correlation between two random matrices. 
This nontrivial new property happens to be the right extension of classical independence, as it allows for an efficient 
algorithm of multiplying free random variables (FRV), which we state below: 

Step 1: Suppose we have two random matrices, Hi and H2, mutually free. Their spectral properties are best wrought 
into the M-transforms ©, Mh 1 (z) and Mn 2 (z). 

Step 2: The critical maneuver is to turn attention to the functional inverses of these M-transforms, the so-called 
" A-transforms , " 

M H (N h (z)) = Na (M H (z)) = z. (10) 
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Step 3: The TV-transforms submit to a very straightforward rule upon multiplying free random matrices (the "FRV 
multiplication law" ) , 

N Hl n 2 (z) = j^Nh^Nh^z), for free Hi, H 2 . (11) 

Step 4: Finally, it remains to functionally invert the resulting ^-transform A^h 1 h 2 (^) to gain the M-transform of 
the product, MhiH 2 ( z )j an d consequently, all the spectral properties via formula ©• 

It is stunning that such a simple prescription (relying on the choice of the M-transform as the carrier of the mean 
spectral information, and the construction of its functional inverse, the iV-transform, which essentially multiplies 
under taking the free product) resolves the multiplication problem for free random noncommutative objects. 

Let us just mention that the addition problem may be tackled along similar lines: In this case, the Green's 
function should be exploited, its functional inverse considered {Gn{Bn(z)) = Bn{Gn{z)) = z; it is sometimes called 
the "Blue's function" [l2|, 0), which obeys the "FRV addition law," Bh 1+ h 2 (z) = B ai (z) + Bn 2 (z) — 1/z, for two 
free random matrices. In this paper, we do not resort to using this addition formula, even though our problem could 
be approached through it as well. 

Let us also remark that in the original mathematical formulations of these frames, a slightly different language 
is employed: Instead of the iV-transform, the "5-transform" is found convenient, Sn(z) = (1 + z)/(zNn(z)), while 
in place of the Blue's function, one engages the "_R-transform," Rn(z) = Bn(z) — 1/z. They fulfil simpler laws, 
»5h 1 h 2 (^) = S-n 1 (z)S t i 2 (z) and i2 Hl +R 2 (z) = + Yn 2 (z), respectively. 



3. Doubly Correlated Wishart Ensembles from Free Random Variables 



The innate potential of the FRV multiplication ^algorithm (fTT|) is surely revealed when inspecting the doubly 
correlated Wishart random matrix c = (1/T)\/CYAY T \/C This has been done in detain in [10], so we will only 
accentuate the main results here, referring the reader to the original paper for a thorough explanation. 

The idea is that one uses twice the cyclic property of the trace (which permits cyclic shifts in the order of the 
terms), and twice the FRV multiplication law pT|) (to break the iV-transforms of products of matrices down to their 
constituents), in order to reduce the problem to solving the uncorrelated Wishart ensemble (1/T)Y T Y. This last 
model is further simplified, again by the cyclic property and the FRV multiplication rule applied once, to the standard 
GOE random matrix squared (and the projector P = diag(lAr, 0t-jv)j designed to chip the rectangle Y off the square 
GOE), whose properties are firmly established. Let us sketch the derivation, 

cyclic FRV cyclic 

N c (z) 4 N^ A ^ TC (z) i -l- N ^ A ~ T ( z )N c (z) ± 

cyclic FRV 

Z Z TZ 

= t— N^ T ^ A (rz)N c (z) 4 — — - — — N ± ^(rz)N A (rz)N c (z) = 
1 + Z ? 1 + zl + rzT 11 

= rzN A {rz)N c (z). (12) 

This is the basic formula. Since the spectral properties of c are given by its M-transform, M = M c (z), it is more 
pedagogical to recast (|T2l) as an equation for the unknown M, 

z = rMN A (rM)N c (M). (13) 

It provides a means for computing the mean spectral density of a doubly correlated Wishart random matrix once the 
"true" covariance matrices C and A are given. 

In this communication, only a particular instance of this fundamental formula is applied, namely with an arbitrary 
auto-covariance matrix A, but with trivial cross-covariances, C = ljy- Using that N 1k (z) = 1 + 1/z, equation (fTS")) 
thins out to 

rM - MA (^7))' (M) 

which will be strongly exploited below. Let us mention that these equalities (|13j) . (|14[) have been derived through 
other, more tedious, techniques (the planar Feynman-diagrammatic expansion, the replica trick) in fl4l - fl8j . 
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III. VARMA FROM FREE RANDOM VARIABLES 



In what follows, we will assume that the VMA(g), VAR(g), or VARMA(gi, 92) stochastic processes are covariance 
(weak) stationary; for details, we refer to fl9j . It implies certain restrictions on their parameters, but wc will not 
bother with this issue in the current work. Another consequence is that the processes display some interesting features, 
such as invcrtibility. 

For all this, we must in particular take both TV and T large from the start, with their ratio r = N/T fixed ©. 
More precisely, we stretch the range of the a-index from minus to plus infinity. This means that all the finite-size 
effects (appearing at the ends of the time series) are readily disregarded. In particular, there is no need to care about 
initial conditions for the processes, and all the recurrence relations arc assumed to continue to the infinite past. 



A. The VMA(q) Process 



1. The Definition of 'VMA(q) 



We consider a situation when N stochastic variables evolve according to identical independent VMA(g) (vector 
moving average) processes, which we sample over a time span of T moments. This is a simple generalization of the 
standard univariate weak-stationary moving average MA(q). In such a setting, the value Yi a of the i-th (i = 1, . . . , N) 
random variable at time moment a (a = 1, . . . , T) can be expressed as 

<? 

Here all the eta's are IID standard (mean zero, variance one) Gaussian random numbers (white noise), (ei a £jb) = &ij&ab- 
The a a 's are some (q + 1) real constants; importantly, they do not depend on the index i, which reflects the fact that 
the processes arc identical and independent (no "spatial" covariances among the variables). The rank q of the process 
is a positive integer. 



2. The Auto-Covariance Matrix 



In order to handle such a process (fT5|) . notice that the Yi a 's, being linear combinations of uncorrelatcd Gaussian 
numbers, must also be Gaussian random variables, albeit displaying some correlations. Therefore, to fully characterize 
these variables, it is sufficient to calculate their two-point covariance function; this is straightforwardly done (see 
appendix Q] for details), 



(Y ia Y jb ) = SijA™, (16) 



where 



q q-d 

A ib = K o S ab + (fia.b-d + S a ,b+d) , with = a a a a+d , d = 0,l,...,q. (17) 

d=l a=0 

In other words, the cross-covariance matrix is trivial, C = ljv (no correlations between different variables), while the 
auto-covariance matrix A^ 1 ), responsible for temporal correlations, can be called u (2q + l)-diagonal." In the course 
of this article, we will use several different auto-covariance matrices, and for brevity, we decide to label them with 
superscripts; their definitions are all collected in appendix [2] 

For example, in the simplest case of VMA(l), it is tri-diagonal, 

A ab = ( fl + a l) S ab + ClQai (<5a,6-l + ^a,h+l) • (18) 
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3. The Fourier Transform and the M -Transform of the Auto-Covariance Matrix 

Such an infinite matrix (|17p is translationally invariant (as announced, it is one of the implications of the weak 
stationarity), i.e., the value of any of its entries depends only on the distance between its indices, 4v = A^(a — b); 
specifically, A^(±d) = for d = 0, 1, . . . , q, and j4W(|d| > q) = 0. Hence, it is convenient to rewrite this matrix 
in the Fourier space, 



A {1 Hp) = ^2e idp A^(d) = k { o> +2^4 cos(dp). (19) 

dSZ d=l 

In this representation, the Af -transform of A' 1 ) is readily obtained (loj . 



M A (,i)(z) = — d P =7^ ■ 



(20) 



This integral can be evaluated by the method of residues for any value of q, which we do in appendix [2J where also 
we print the general result (|42[) . In particular, for q = 1, 

M AW {z) = -= Z a -l, (21) 

y z - (ao + ai) y z- (ao - ai) 

where the square roots are principal. 



4- The Pearson Estimator of the Covariances from Free Random Variables 

We will be interested^ in investigating the spectral properties of the Pearson estimator 
c = (1/T)YY T = (1/T)YAWY t @. The M-transform of this correlated Wishart random matrix, M = M c (z), 
can be retrieved from equation (|14[) . We could write it for any q using (|42p . but we will restrict ourselves only to 
q = 1, in which case the substitution of (|2T|) leads to a fourth-order polynomial (Ferrari) equation for the unknown 

M, 

r A (a 2 - af) 2 M 4 + 2r 3 (- (ag + a\) z + (a 2 - a\) 2 (r + 1)) M 3 + 
+r 2 [z, 2 - 2 (al + al) (r + 2)z + (a 2 - a\) 2 (r 2 + 4r + l)) M 2 + 

+ 2r [z 1 - (a\ + af) (2r + l)z + (a 2 - a 2 ) 2 r(r + 1)) M + r (-2 (a 2 + a 2 ) z + (a 2 - a 2 ) 2 r) = 0. (22) 

The FRV technique allowed us therefore to find this equation in a matter of a few lines of a simple algebraic compu- 
tation. It has already been derived in Q, and (|22p may be verified to coincide with the version given in that paper. 
In the pertinent equation is printed before (A. 6), and to compare the two, one needs to change their variables 
into ours according to y —t 1/r, x — > z/r, and m —> — r(l + M)/z. The last equality means that m and to of @ 
correspond in our language to the Green's functions —rG c (z) and —G & (z/r), respectively, where a. — (l/iV)Y T Y is 
the Pearson estimator dual to c. As mentioned, a quick extension to the case of arbitrary q is possible, however 
the resulting equations for M will be significantly more complicated; for instance, for q = 2, a lengthy ninth-order 
polynomial equation is discovered. 
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B. The VAR(g) Process 



1. The Definition of VAR(g) 



A set-up of N identical and independent VAR(g) (vector auto-regressive) processes is somewhat akin to (|15|). 
i.e., we consider N decoupled copies of a standard univariate AR(q) process, 

9 

Y ia - ^2 bpYi^-p = a e ia . (23) 

/3=1 

It is again described by the demeaned and standardized Gaussian white noise €i a (which triggers the stochastic 
evolution), as well as (q + 1) real constants ao, bp, with (3 — I, ... ,q. As announced before, the time stretches to 
the past infinity, so no initial condition is necessary. Although at first sight (|23[) may appear to be a more involved 
recurrence relation for the Yi a 's, it is actually easily reduced to the VMA(g) case: It remains to remark that if 
one exchanges the Y^'s with the e, 's, one precisely arrives at the VMA(g) process with the constants = 1/ao, 
a"? = —bp/a a , /3 = 1, . . .,q. In other words, the auto-covariance matrix A^ 3 ) of the VAR(g) process (p?3"j) is simply 
the inverse of the auto-covariance matrix A^ 2 ) of the corresponding VMA(g) process with the described modification 
of the parameters, 

A< 3 > = (A< 2 >)~\ (24) 
This inverse exists thanks to the weak stationarity supposition. 



2. The Fourier Transform and the M -Transform of the Auto-G'ovariance Matrix 



The Fourier transform of the auto-covariance matrix A^ 3 * 1 of VAR(q) is therefore a (number) inverse of its 
counterpart for VMA(g) with its parameters appropriately changed, 

A(3)(p) = ^J_= , (25) 

AC0(p) 4 ) +2E3=i4 ) «»(*) 



where 

q-d 

n d 2) = — ^2 b a b a +d, d = Q,l,...,q, (26) 



U Q=0 

and where we define bo = — 1. 

In order to find the M-transform of the inverse matrix, A^ 3 ^ = (A^ 2 -*) -1 , one employs a general result, true for 
any (real symmetric) random matrix H, and obtainable through an easy algebra, 

M H -i(«) = -M H (l/z)-l. (27) 

Since the quantity Af A < 2 ) (z) is known for any q (|42p . hence is M A ( 3 )(z) via (|27[) , but we will not print it explicitly. 
Let us just give it for q = 1, in which case (|27p and (f2~Tj) yield 



3. The Auto-Covariance Matrix 



Despite being somewhat outside of the main line of thought of this article, an interesting question would be to 
search for an explicit expression for the auto-covariance matrix A 1 - 3 ) from its Fourier transform (|25[) . 



A {3) (d) = ±- f dpe" 1 *-^ * — , (29) 
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where we exploited the fact that A^ 3 ' must be translationally invariant, A^J = A^(a — b). This computation would 
shed light on the structure of temporal correlations present in a VAR setting. 

This integral is evaluated by the method of residues in a very similar manner to the one shown in appendix [21 
and we do this in appendix [5] We discover that the auto-covariance matrix is a sum of q exponential decays, 

A^(d) = Y,C 7 e-^, (30) 

7=1 

where C 7 are constants, and T 7 are the characteristic times (|44[) , 7 = 1, ...,<?; these constituents are given explicitly 
in (|43j) . This is a well-known fact, nevertheless we wanted to establish it again within our approach. 

For example, for q = I, the auto-covariance matrix of VAR(l) is one exponential decay, 

A (3) (d) = _^i_ & wi (31) 

where we assumed for simplicity < b\ < 1 (the formula can be easily extended to all values of bi). 



4- The Pearson Estimator of the Covariances from Free Random Variables 

Having found an expression for the M-transform of the auto-covariance matrix A' 3 ) of a VAR(g) (|27[) . (|4"!2)) , we 
may proceed to investigate the^equation (fT4"]) for the M-transform M = M c (z) of the correlated Wishart random 
matrix c = (1/T)YY T = (1/T)YA' 3 'Y t (@J). We will do this explicitly only for q = 1, when fll§ leads to a fourth- 
order (Ferrari) polynomial equation for the unknown M, 

a^r 2 M 4 + 2a^r (- (l + b\) z + a^r) M 3 + 

+ ((1 - b\) 2 z 2 - 2a 2 Q r (1 + b 2 ) z + (r 2 - l) a 4 ) M 2 - 2a%M - aj| = 0. (32) 

This equation has been derived by another method in Q, and our result confirms their equation (A. 8), with the 
change in notation, y — > 1/r, x — > z/r, z — > rM. 



C. The VARMA(gi,<j 2 ) Process 
1. The Definition of VARMA(qi , 172) 

The two types of processes which we elaborated on above, VAR(qi) and VMA(g2), can be combined into one 
stochastic process called VARMA(<7i, 52), 

91 92 

Y ia - bpY i>a -0 = a a e ita - a . (33) 

Now it is a straightforward and well-known observation (which can be verified by a direct calculation) that the auto- 
covariance matrix A^ 5 * 1 of this process is simply the product (in any order) of the auto-covariance matrices of the 
VAR and VMA pieces; more precisely, 

A (S) = J A (4))" 1 A (D ) (34) 

where A^ 1 ) corresponds to the generic VMA(g2) model (fT?)) . while A^ 4 ' denotes the auto-covariance matrix of 
VMA(gi) with a slightly different modification of the parameters compared to the previously used, namely a^ 4 ' = 1, 
al = —bp, for j3 = 1, . . . , q\. We have thus already made use here of the fact that the auto-covariance matrix of a 
VAR process is the inverse of the auto-covariance matrix of a certain corresponding VMA process ([24j) . but the new 
change in parameters necessary in moving from VAR to VMA has effectively eto = 1 w.r.t. what we had before (|24[k 
it is understandable: this "missing" ag is now included in the matrix of the other VMA(g2) process. 
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FIG. 1: The mean spectral density p c (A) of the Pearson estimator c of the cross-covariances in the VARMA(1, 1) process 
computed numerically from the sixth-order polynomial equation (|45[) , for various values of the process' parameters. The scale 
of these parameters is determined by choosing ao — 1 everywhere. Recall that the theoretical formula (|45[) is valid in the 
thermodynamical limit ([6]) of N, T — s> oo, with r = N/T kept finite. 

UP LEFT: We set the remaining VARMA parameters to a\ = 0.3, b\ — 0.2, while the rectangularity ratio takes the values 
r — 0.5 (the purple line), 0.1 (red), 0.02 (magenta), 0.004 (pink); each one is 5 times smaller than the preceding one. We 
observe how the graphs become increasingly peaked (narrower and taller) around A = 1 as r decreases, which reflects the 
movement of the estimator c toward its underlying value C = ljv. 

UP RIGHT: We fix r = 0.25 and consider the two VARMA parameters equal to each other, with the values ai = bi = 0.6 
(purple), 0.4 (red), 0.2 (magenta), 0.01 (pink). 

DOWN LEFT: We hold r = 0.25 and 6i = 0.2, and modify ai = 0.6 (purple), 0.4 (red), 0.2 (magenta), 0.0 (pink); for this last 
value, the VARMA(1, 1) model reduces to VAR(l). 

DOWN RIGHT: Similarly, but this time we assign r = 0.25 and ai = 0.2, while changing 6i = 0.6 (purple), 0.4 (red), 0.2 
(magenta), 0.0 (pink); this last value corresponds to VMA(l). 



2. The Fourier Transform and the M -Transform of the Auto-Covariance Matrix 



The Fourier transform of the auto-covariance matrix A^ 5 ) of VARMA(gi, (fe) (|34|) is simply the product of the 
respective Fourier transforms (flU]) and ([25j) . 



«o + 2 EiU K V cos(d lP ) 



2 cos (d„p) 



where 



3i— di Q2 — d,2 

Z dl = kai&ai+di) K d 2 = ^2 a ^2 a a 2 J rd 2 -> = 0? 1) • * • 5 Ql-> ^2 = 0, 1, . . . , q 2 , (36) 

ai= a.2— 
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FIG. 2: Monte Carlo simulations of the mean spectral density p c (A) (the green plots) compared to the theoretical result 
obtained numerically from the sixth-order equation (|45p (the dashed red lines). The conformity is nearly perfect. We generate 
the matrices Y of sizes N = 50, T = 200 (i.e., r — 0.25) from the VARMA(1, 1) process with the parameters ao = 1, oi = 0.3, 
bi = 0.2. The Monte Carlo simulation is repeated 1,000 (LEFT) or 10,000 (RIGHT) times; in this latter case, a significant 
improvement in the quality of the agreement is seen. One notices finite-size effects at the edges of the spectrum ("leaking 
out" of eigenvalues): in the numerical simulations, N and T are obviously finite, while equation (|45p is legitimate in the 
thermodynamical limit only, hence the small discrepancies; by enlarging the chosen dimensions 50 x 200 one would diminish 
this fallout. 



where we recall bo = — 1. For instance, for VARMA(1, 1) (it is described by three constants, a , a±, b\), one explicitly 
has 

^ _ a 2 + a 2 1 +2a a 1 cosp 

A {P) l + bj-2b lC osp ■ (S7 > 

The M-transform of A^ 5 ' can consequently be derived from the general formula (|20[) . We will evaluate here the 
pertinent integral only for the simplest VARMA(1, 1) process, even though an arbitrary case may be handled by the 
technique of residues, 

M A(V ( z ) = —7— ~ a o a i + , ; I ' ( 38 ) 

a ° ai + blZ V V(l - 6i) 2 z - (a + ai f^(l + hf z - (a - aif 



3. The Auto-Covariance Matrix 



One might again attempt to track the structure of temporal covariances in a VARMA process. This can be done 
either by the inverse Fourier transform of (|35[) . or through a direct computation based on the recurrence relation 
([551) (importantly, adhering to the assumption that it stretches to the past infinity). Let us print the result just for 
VARMA(1,1), 

A (d)--—5 d>0 + Ml _ 6 2 } 61. (39) 

where for simplicity < 61 < 1. This is an exponential decay, with the characteristic time of the VAR piece, with an 
additional term on the diagonal. 



4- The Pearson Estimator of the Covariances from Free Random Variables 



Expression (|38j) . along with the fundamental FRV formula (|14|) . allowjis to write the equation satisfied by the M- 
transform M = M c (z) of the Pearson estimator c = (1/T)YY T = (1/T)YA< 5 )Y T © of the cross-covariances in the 
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VARMA(1, 1) process; it happens to be polynomial of order six, and we print it P5j) in appendix l"5l It may be solved 
numerically, a proper solution chosen (the one which leads to a sensible density: real, positive-definite, normalized to 
unity), and finally, the mean spectral density /0 C (A) derived from Q. We show the shapes of this density for a variety 
of the values of the parameters r, do, ai, b\ in fig. [T] Moreover, in order to test the result ([35)1 . and more broadly, to 
further establish our FRV framework in the guise of formula (|14p , the theoretical form of the density is compared to 
Monte Carlo simulations in fig. [2j they remain in excellent concord. These are the main findings of this article. 



IV. CONCLUSIONS 

In this paper we attempted to advertise the power and flexibility of the Free Random Variables calculus for 
multivariate stochastic processes of the VARMA type. The FRV calculus is ideally suited for multidimensional time 
series problems, provided the dimensions of the underlying matrices are large. The operational procedures are simple, 
algebraic and transparent. The structure of the final formula which relates the moments' generating function of 
the population covariance and the sample covariance allows one to easily derive eigenvalue density of the sample 
covariance. We in detail illustrated how this procedure works for VARMA(1, 1), confronted the theoretical prediction 
with numerical data obtained by Monte Carlo simulations of the VARMA process and observed a perfect agreement. 

The FRV calculus is not restricted to Gaussian variables. It also works for non-Gaussian processes, including 
those with heavy-tailed increments belonging to the Levy basin of attraction, where the moments do not exist. 
Since the majority of data collected nowadays is naturally stored in the form of huge matrices, we believe that the 
FRV technique is the most natural candidate for the matrix-valued "probability calculus" that can provide efficient 
algorithms for cleaning (de-noising) large sets of data and unraveling essential but hidden correlations. 
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Appendices 

1. The Auto-Covariance Matrix for VMA(g) 

In this appendix, we sketch a proof of the formula (jTT)) for the auto-covariance matrix of the VMA(g) process. 
As mentioned, since the random variables are centered Gaussian, this matrix alone suffices to completely capture all 
their properties. We set i = j; the dependence on this index may be dropped as there are no correlations here. We 
use the definition p5[) of VMA(g), as well as the auto-covariance structure of the white noise, (ci a ejb) = 6ijS a b- This 
leads to 

9 9 9 9 

^ab = (XiaXib) = J! X] aa<1 f 3 ( e i,a-a e «,b-/3) = X] X] a a a (3&a-u,b- = 

a=0 /3=0 a=0 0=0 

The double sum is symmetrized, the index ft replaced by d = f3 — a, 

j 9 9-« 

• • • = 2 X X aa(la + d ($b,a+d + Sb.a-d) = ■ ■ ■ , 

a=0 d=—o 

and the order of the sums interchanged (an elegant method for this is explained in (20l|). 



9 / 9-min(0,rf) 
- ^ ( a a a a+d I (Sb,a+d + ^b,a- 



2 

d= — q \a=m&x(0, — d) 



which, upon splitting the sum over d into three pieces (from — q to — 1, d = 0, and from 1 to q), is quickly seen to 
coincide with (flTl) . 
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2. A List of the Various Auto— Covariance Matrices Used 



For the reader's convenience, let us collect in this appendix the five auto-covariance matrices which are defined 
throughout this paper: 

• By A^ 1 ) we denote the auto-covariance matrix of the VMA(q) process with the generic constants a a , with 
a = 0, 1, . . . , q, as defined in (Tl5l) . 

• By A' 2 ) we denote the auto-covariance matrix of the VMA(g) process with the constants = 1/ao, 
ay = -bp/a , where j3 = 1, . . . , q. 

• By A^ 3 ) we denote the auto-covariance matrix of the VARf^) process with the generic constants ao, bp, with 
P = 1, ... ,q, as defined in (g3|. There holds = (A^))- 1 ([2% 

• By A^ 4 ) we denote the auto-covariance matrix of the VMA(gi) process with the constants ag 4 ' = 1, a^' = —bp, 
where j3 — 1, . . . , q±. 

• By A^ 5 ) we denote the auto-covariance matrix of the VARMA(<?i, q%) process with the generic constants bp, 
a a , with p = l,...,qi and a = 0, 1, . . . , q 2 , according to the definition ([35]). There is A (5) = (A (4) ) _1 A (1) (f34|) . 
where in the latter piece q = (72 • 



3. The M-Transform of the Auto-Covariance Matrix for VMA(q) 



We will derive here the M-transform (|20[) of the auto-covariance matrix A^ of an arbitrary VMA(g) process, 
using the expression for its Fourier transform (|19[) . It is a little simpler to consider the Green's function, 

i + m a(1) (s) 1 r 1 r , n . 

Gkw\ z ) = = -/ d P =7- — . ( 40 ) 

z tt Jo z-AV>(p) 

where the integration range has been halved due to the evenness of the integrand. 

This inte gral is performed with help of the change of variables y = 2 cos p. The measure, when p G [0,tt], reads 
dp = —dy/y/4 — y 2 . A basic observation is that the denominator of the integrand is a linear combination of cos(dp), 
for d = 1, . . . ,q, and each such a cosine can be cast as a polynomial of order d in y through the de Moivre formula. 
Hence, the denominator is a polynomial of order q in y, 

1 <? 

aw ( P )-z = 4 1} - z + 2 4 ] cos(dp) = n (» - yp) > ( 41 ) 

d=l p=l 

where the yp's are the q roots (which we assume to be single), and ip is the coefficient at y q . Using the method of 
residues, one readily finds 

1 1 f 2 1 1 1 J 1 ■> 1 1 

Ga<1) {z) = ~-i> I-2 d V4-? 2 \[u^y-vp) = * S niu ^-^^-2^7+2' (42) 

/3^7 

where the two square roots on the r.h.s. are principal. This is an explicit formula for the Green's function of A^ 1 ', 
provided one has factorized the order-g polynomial (|41[) . 



4. The Auto-Covariance Matrix for VAR(g) 



Let us argue now that the Fourier transform ([25| leads to the auto-covariance matrix of VAR(g) (|29j) of the form 
of a sum of exponential decays (|30[) . and let us give precise expressions for the constants C 7 and the characteristic 
times T 7 , 7 = 1, . . . , q. 
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We proceed by the technique of residues, analogously to appendixQH however this time with aid of another variable, 
x = e~ lp , related to the previously used through y = 2cosp = x + l/x. The integration measure is dp = idx/x, 
and the integration path is counterclockwise around the centered unit circle. The denominator of the integrand 
is a polynomial of order q in the variable y, having thus some q roots yp, (3 = l,...,q. Therefore, there are 2q 
corresponding solutions for the variable x, with a half of them inside the integration path and a half outside; let xp 
be the solutions to x + l/x — yp with the absolute values less than 1. Only them contribute to the integral, and their 
residues straightforwardly give 

A^(d) = jfl ^ (43 ) 

This is indeed q exponents (i 7 )' d ', 7 = 1, ... , q. Remark that the solutions may be complex, hence this is really q 
different exponential decays exp(— \d\/Tj), with the characteristic times 

log|x 7 | 

(these times are positive as the roots have the absolute values less than 1), possibly modulated by sinusoidal oscillations 
when a root has an imaginary part. 



For example, for q = 1 there is one exponential decay (|31j). while for q = 2, one obtains either two exponential 
decays (the two roots are real and different), or one exponential decay modulated by oscillations (the two roots are 
complex and mutually conjugate), etc. 



5. The Equation for the M— Transform of the Pearson Estimator of the Covariances for VARMA(1, 1) 
The sixth-order polynomial equation obeyed by Me M c (z) in the case of VARMA(1, 1) reads, 

422/2 2\ 2 ii,r6 , 

r a a 1 [a — a 1 ) M + 

+2r 3 a ai ^ (a£ - Baga 2 + af) h - a a 1 (ag + af) (b\ + l) + (1 + 2r) a ai (a 2 , - a 2 ) 2 j M 5 + 

+r 2 (( («o ~ 20a o a i + «i) b l - 4 «oai (4 + a 2 ) b x (b\ + l)+ a 2 al (b\ + l) ^ z 2 + 
+2a oai ( ((1 + 3r) (4 + af) - 2 (5 + 9r) aga 2 ) 61 - (2 + 3r) a ai (a 2 + a?) (b 2 + l) \ z+ 
+ (1 + 8r + 6r 2 ) a 2 al (a 2 - a?) 2 j M 4 + 

+2r ^61 ^ - 6a aife 2 - (a 2 , + a?) 61 (6? + l) + a a x (b\ + l)^z 3 + 
(-10 (1 + 2r) a 2 a 2 +r(a% + af)) b 2 -2{l + 2r) a ai (a 2 + a 2 ) h (b 2 + l) + (1 + r) a 2 a 2 (bj + l) J z 2 + 
( (3r (1 + r) (ag + a 4 ) - 2 (2 + 15r + 9r 2 ) a 2 a 2 ) 61 - (l + 6r + 3r 2 ) a ai (a 2 + a\) (bj + l) J z+ 



-aoai 
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+ 2r(l + 3r + r 2 )a 2 a 2 (a 2 l -a 2 ) j M 3 + 
+ ( b\ (l - blf z A + 26i ^ - 2 (1 + 3r) a ai6? - r (a 2 + a?) (b 2 + l) + (1 + r) a oai (&f + l) ^ z 3 + 

+ (^- ((1-r 2 ) (og + at) + 2 (3 + 20r + 10r 2 ) a 2 a\) 6 2 - 
- 2 (l + 4r + 2r 2 ) a ai (a§ + a 2 ) &i (b 2 + l) + r (4 + r) a 2 a 2 (&f + l) ^ z 2 + 
+2ra Qai ( (r (3 + r) (ajj + a?) - 6 (2 + 5r + r 2 ) a^a 2 ) 6 X - (3 + 6r + r 2 ) a ai (a 2 , + a 2 ) (fe 2 + 1) J 2+ 

+ r 2 (6 + 8r + r 2 ) a 2 a 2 (a 2 - a 2 ) 2 j A/ 2 + 

+2^a ai6i (l - b\) 2 z 3 + 

+ ( - (4 + af + 2 (3 + 5r) a 2 a 2 ) 6 2 - 2 (1 + r) a ai (a 2 + a\) b x (bj + l) + ra 2 a 2 (6} + l) \ z 2 + 
+ra ai ( (r (a% + of) - 2 (6 + 5r) aga 2 ) &i - (3 + 2r) a ai (og + a 2 ) (bj + l) J z+ 

+ r 2 (2 + r) a 2 a 2 (a 2 - a 2 ) 2 M- 

-6i ^ (oq + 6a^a 2 + of) &i + 2a ai (a 2 , + a 2 ) (6 2 + l) J z 2 - 

- 2rala 2 1 ^4a Q a 1 b 1 + (a 2 + a 2 ) (& 2 + l) ^ 2 + r 2 a 2 a 2 (a 2 - a 2 ) 2 =0. (45) 
This equation in a Mathematica file can be obtained from the authors upon request. 



[1] Sims C. A., Macroeconomics and reality, Econometrica 48 (1980) 1. 

[2] Smets F., Wouters R., An Estimated Stochastic Dynamic General Equilibrium Model of the Euro Area, European Central 

Bank Working Paper Series, working paper 171, August 2002 [http://www.ecb.int]. 
[3] Box G., Jenkins G. M., Reinsel G., Time Series Analysis: Forecasting and Control, 3rd ed., Prentice-Hall, Englewood 



16 



[4] Laloux L., Cizeau P., Bouchaud J.-P., Potters M., Noise Dressing of Financial Correlation Matrices, Phys. Rev. Lett. 83 

(1999) 1467 [arXiv:cond-mat/9810255]. 
[5] Plerou V., Gopikrishnan P., Rosenow B., Amaral L. A. N., Stanley H. E., Universal and non-universal properties of 

cross-correlations in financial time series, Phys. Rev. Lett. 83 (1999) 1471 [arXiv : cond-mat/9902283]. 
[6] Jin B., Wang C, Miao B., Huang M.-N. L., Limiting spectral distribution of large-dimensional sample covariance matrices 

generated by VARMA, Journal of Multivariate Analysis 100 (2009) 2112. 
[7] Bai Z. D., Silverstein J. W., Spectral Analysis of Large Dimensional Random Matrices, Science Press, Beijing, 2006. 
[8] Voiculescu D. V., Dykema K. J., Nica A., Free Random Variables, CRM Monograph Series, Vol. 1, Am. Math. Soc, 

Providence, 1992. 

[9] Speicher R., Multiplicative functions on the lattice of non-crossing partitions and free convolution, Math. Ann. 298 (1994) 
611. 

[10] Burda Z., Jarosz A., Jurkiewicz J., Nowak M. A., Papp G., Zahed I., Applying Free Random Variables to Random Matrix 

Analysis of Financial Data, submitted to Quantitative Finance. 
[11] Wishart J., The Generalized Product Moment Distribution in Samples from a Normal Multivariate Population, Biometrika 

A 20 (1928) 32. 

[12] Zee A., Law of addition in random matrix theory, Nucl. Phys. B 474 (1996) 726 [arXiv: cond-mat/9602146]. 
[13] Janik R. A., Nowak M. A., Papp G., Zahed I., Various Shades of Blue's Functions, Acta Phys. Polon. B 28 (1997) 2949 
[arXiv : hep-th/9710103] . 

[14] Burda Z., Gorlich A., Jarosz A., Jurkiewicz J., Signal and Noise in Correlation Matrix, Physica A 343 (2004) 295 
[arXiv : cond-mat/0305627]. 

[15] Burda Z., Jurkiewicz J., Signal and Noise in Financial Correlation Matrices, Physica A 344 (2004) 67 
[arXiv : cond-mat/0312496] . 

[16] Burda Z., Jurkiewicz J., Waclaw B., Spectral Moments of Correlated Wishart Matrices, Phys. Rev. E 71 (2005) 026111 
[arXiv : cond-mat/0405263] . 

[17] Burda Z., Gorlich A., Jurkiewicz J., Waclaw B., Correlated Wishart Matrices and Critical Horizons, Eur. Phys. J. B 49 

(2006) 319 [arXiv:cond-mat/050834l]. 
[18] Burda Z., Jurkiewicz J., Waclaw B., Eigenvalue density of empirical covariance matrix for correlated samples, Acta Phys. 

Pol. B 36 (2005) 2641 [arXiv: cond-mat/0508451]. 
[19] Liitkepohl H., New Introduction to Multiple Time Series Analysis, Springer Verlag, Berlin, 2005. 

[20] Graham R., Knuth D., Patashnik O., Concrete Mathematics: A Foundation for Computer Science, Addison-Wesley, 1994. 



